How to Create Uber-Fast Maps With Oracle 23ai Vector Tiles and H3 Indexes (Part 2)

Comments 0

Share to social media

Since this article was written, Oracle AI Database 26ai has been released.

Oracle Database 23ai added 300+ new features like the new VECTOR datatype that get most of the attention, but often overlooked are two additions that dramatically expand support for complex geospatial problem-solving.

In the prior article in this series, I demonstrated how Oracle 23ai’s vector tiles are a solution for improving the speed at which large volumes of map points are displayed within modern mapping applications.

However, rapid display and navigation of point clouds is just one challenge these applications need to solve. There’s also the need to aggregate information from those individual points and display those aggregations meaningfully.

That’s where the Hierarchical Hexagonal Indexing (H3) features in Oracle 23ai come in.

What is Hierarchical Hexagonal Indexing (H3)?

Aggregation is at the heart of Hierarchical Hexagonal Indexing (H3) which is a relatively new development and, unsurprisingly, came about from a real-world business use case.

Uber discovered that their drivers may not always know local geography well enough to locate their clients, pick them up, and deliver them to their desired destination – but there’s also the issue of assigning a driver to a particular ride request based on which drivers are closest to the rider when the request is made.

A graph showing a summary of H3 Spatial Indexing.
Figure 1: H3 Indexing Concepts

As Figure 1 shows, Uber’s solution was to develop and deploy hierarchical hexagonal indexing (H3) at the heart of its own geospatial mapping tools. Here’s a brief overview of how it works:

  • Unlike the solution we saw in the prior article – a series of square tiles regressively mapped within a Mercator projection to a set of squares only a few meters on a side – H3 uses a series of polygons to more accurately cover the entire surface of the earth.
  • They found that hexagons worked better than squares for mapping applications because a circle fits better within the sides of a hexagon, as well as touching the points of the hexagon. Circles are important because it’s easier to determine how many drivers are within a specific radius of a requesting rider’s location.
  • One wrinkle: Because the earth isn’t a perfect sphere, H3 at its lowest resolution actually required use of 12 pentagons plus 110 hexagons for accurate coverage. (Circles still cover more area within / around a pentagon than within a square.)
  • Each polygon contains a set of like polygons within it – so a hexagon encompasses seven other polygons, which in turn each contain seven other polygons, and so forth until the highest resolution level (23) is reached.
  • The real elegance of this strategy is that each polygon in this hierarchical structure knows exactly which polygon is its parent in the level above it (unless it’s at lowest resolution), which polygons are its siblings at the same resolution level, and which polygons are its children (unless it’s at the highest resolution).

Corralling Data Within H3 Indexed Boundaries

In the example query in Figure 2 (below), I used the SDO_UTIL.H3_KEY and SDO_UTIL.H3_BOUNDARY functions to return the corresponding H3_KEY and H3_BOUNDARY values based on longitude/latitude pair values at a specific resolution for the photovoltaic array data stored in table EXISTING_PV_SITES.

Figure 2: Using H3 Index Values in Simple Queries

Here’s what the query returned at resolution level 3 (Figure 3).

What the query returned at resolution level 3.
Figure 3: Result Set, Resolution Level 3

There’s a subtle difference when we compare what’s returned when the same query runs at resolution level 5 (Figure 4): the H3_KEY value for the same row is different because it represents a higher resolution.

Figure 4: Result Set, Resolution Level 5

Taking a look at the corresponding boundaries that correspond to an H3_KEY value, it becomes even clearer: It’s a hexagon with its six corners defined as specific longitude and latitude pairs. For example, Figure 5 shows the contents of the SDO_GEOMETRY column for H3_KEY value 085275323FFFFFFFF from the query results in Figure 4.

Figure 5: H3_BOUNDARY Values of a Specific H3_KEY Polygon

Grouping Disparate Data within H3 Indexed Boundaries

Another neat thing about H3_KEY values: Since they reference identical polygon boundaries (either pentagon or hexagon) at a particular resolution level, I can group aggregate data from multiple data sources at multiple resolution levels to provide hierarchical reporting for disparate data.

The query in Figure 6 shows exactly this strategy by first summarizing values within their derived H3_KEY for solar arrays and wind turbines at the same resolution level, then linking those aggregate values based on their matching H3_KEY attributes (i.e. within the same H3 boundary polygon).

Figure 6: Gathering Aggregate Data from Disparate Data Sources through Matching H3_KEY Values

The resulting report (Figure 7) shows aggregated data from both sources, aggregated within the H3_KEY level at resolution level 5. Of course, I could also make the resolution level a query bind variable, and that would let me aggregate disparate data at that specific resolution within the polygon boundaries returned.

The aggregated data report, showing aggregated data from both sources.
Figure 7: Aggregated Data Report

Using H3 Indexing Features in Mapping Applications

To leverage H3 indexing in a mapping application, I used SDO_UTIL procedure H3SUM_CREATE_TABLE to aggregate statistics related to H3 polygons into a single table.

The code in Figure 8 creates an H3 table named H3SUM_WIND_TURBINES from table EXISTING_WIND_TURBINES. I’m using that table as the data source because it has the highest number of disparate geospatial points spread out across the United States, so it’ll make mapping the aggregated data a bit more interesting.

Figure 8: Creating an H3 Table from Existing SDO_GEOMETRY Data

A few key points about how this H3 summary table was constructed:

  • It will contain a simple count of the number of turbines within each aggregate H3KEY resolution level, as the query in Figure 9 shows.
  • It will also contain SDO_GEOMETRY column aptly named KEY that describes the polygon containing the point cloud at that resolution.
  • Since individual wind turbines are still relatively sparse across a geography as vast as the USA, I’ve constrained the maximum H3 resolution to level 7 to limit the creation of polygons beyond that level.
  • The command to delete from USER_SDO_GEOM_METADATA simply removes the H3 index that gets created automatically whenever an H3 summary table is created.

Figure 9: Querying H3 Summary Table Contents

Just as I did in the prior article, I’ll transform the vector tiles that get produced from H3 Indexed data into something that mapping software can display using Oracle’s ORDS REST API toolset.

Figure 10 shows how I built an ORDS REST API module named wt_summary, defined a corresponding template that accepts variable values, and finally defined a handler that returns just the required appropriate vector tiles as a BLOB based on the parameter values specified.

Figure 10: ORDS REST API Module, Template, and Handler for EXISTING_WIND_TURBINES Vector Tiles

This ORDS module, template, and handler are nearly identical to the examples I provided for vector tiles in the prior article, but there’s one interesting wrinkle for the zoom level bind variable (:z). Since H3 resolution is more fine-grained than vector tiles created via the GET_VECTORTILE procedure, the CASE statement in the ORDS handler ratchets up the zoom level appropriately to a slightly higher value. Otherwise, the polygons returned from the H3SUM_VECTORTILE procedure would be so small they’d be unlikely to contain any points.

Displaying H3 Summary Vector Tiles Via MapLibre GL

Since the H3SUM_VECTORTILE procedure returns vector tiles, I can reuse the majority of my code base I had already built to display them using MapLibre GL. I’ve retained the same settings for initial map position, initial zoom level and base map style; the only modifications I needed to make accommodate the wt_summary ORDS API module. Figure 11 shows the main differences:

An image showing the modifications needed to accommodate the wt_summary ORDS API module.
Figure 11: MapLibre Live Server Specification: Adding the H3 Summary Vector Tiles as a Map Layer
  • The code in the red box accesses the ORDS endpoint for the existing wind turbines I created in Figure 11. Just as before, it connects with the MapLibreGL interface to move the display to selected vector tile contents at the corresponding zoom level for the corresponding X and Y coordinates.
  • The code in the blue box adds the layer to the MapLibreGL map. I tweaked the code here a bit to display a different color for polygons based on the total count of points captured within each polygon, ranging from light green for less than 10 points, yellow for 10-14 points, purple for 15-19 points, and red for 20 points or more.
  • The code in the green box displays a pop-up when one of the polygons is clicked; it contains just the count of points captured within the polygon, since that’s the only aggregate data returned for H3 summary vector tiles.

And what do the corresponding maps look like? Let’s start at nearly the lowest display level of 10 km resolution (Figure 12), which shows concentrations of wind turbines in south-central Wisconsin in the USA. Note the yellow polygon contains 11 wind turbines, providing a visual clue that area has a higher concentration than the other light-green polygons nearby:

A map showing the concentrations of wind turbines in south-central Wisconsin in the USA.
Figure 12: H3 Summary Vector Tiles at 10km Resolution

Zooming out to 30km resolution (Figure 13), the color coding of the polygons starts to make a bit more sense. Even with fewer polygons, we’re still able to identify different concentrations of wind turbines. Note the pop-up for the red polygon reflects the higher number within (48) than others nearby:

Another map of wind turbines in Wisconsin, USA.
Figure 13: H3 Summary Vector Tiles at 30km Resolution

What does the view at a nearly-continental resolution level look like? At 500km (Figure 14), it’s easy to see the Great Plains of the United States is clearly the champion of wind generation for the country:

500km resolution map of the United States.
Figure 14: H3 Summary Vector Tiles at 500km Resolution

And finally, at 1000km resolution (Figure 15), we’re able to view the data from the minimum zoom level:

1000km resolution map of the United States and surrounding countries.
Figure 15: H3 Summary Vector Tiles at 1000km Resolution

Vector Tile Caching

While vector tiles are easily and speedily generated in the examples I’ve presented, what about a scenario with much larger point clouds? For example, imagine a scenario for a financial institution tracking hundreds of thousands of bank branches or automated teller machines (ATMs) around the world.

If I could retain the vector tiles generated for the most commonly accessed points, that could save an enormous amount of compute time whenever a mapping application needed to display the tiles containing those points.

The good news is it’s easy to enable vector tile caching with a few simple calls to SDO_UTIL procedures. Figure 16 shows how I enabled caching for the GEOMETRY column in tables EXISTING_EV_CHARGERS and EXISTING_PV_SITES with the ENABLE_VECTORTILE_CACHE procedure:

Figure 16: Enabling Vector Tile Caching at Different Zoom Levels

A few points to consider from this example:

  • Since there are considerably fewer solar array sites than EV chargers, I’ve stopped their caching at zoom level 15; any vector tiles at levels 16-23 would not be cached.
  • Since EV chargers would likely be searched for at deeper zoom levels, I’m permitting them to be cached at the maximum zoom level (23).
  • If there was an additional column of datatype SDO_GEOMETRY in either table, it would not be cached unless specifically mentioned.

Once vector tile caching is enabled, I can query view SDO_VECTOR_TILE_CACHE$INFO to view which tables containing which columns of data type SDO_GEOMETRY are enabled for vector tile caching (Figure 17):

An image showing which tables containing columns of data type SDO_GEOMETRY are enabled for vector tile caching.
Figure 17: Viewing Metadata for Cached Vector Tiles

Figure 18 shows that vector tile caching is actually happening. A simple query against the SDO_VECTOR_TILE_CACHE$TABLE view shows which vector tiles have been cached as a result of queries for one or more specific vector tiles – in this case, the tile located at (X,Y) = (129,187) at zoom level 9:

Image showing that vector tile caching is actually happening.
Figure 18: Viewing Metadata for Cached Vector Tiles

Controlling Access to Cached Vector Tiles

I also have the power to limit which user accounts are allowed to leverage cached vector tiles. The code blocks in Figure 19 show two examples of access control for vector tiles:

Figure 19: GRANTing or REVOKing Access to Existing Vector Tile Caches

  • The first code block grants the GEOSWARM user account access to all vector tile caches owned by the executing schema via procedure via SDO_UTIL.GRANT_VECTORTILE_CACHE.
  • The second code block does the opposite: it revokes access to any vector tiles owned by the executing schema to the HRADM user account.

Deactivating and Purging Vector Tile Caching

Finally, I have the power to deactivate vector tile caching and even purge cached vector tiles, as shown in Figure 20:

Figure 20: Disabling Caching and Purging Already-Cached Vector Tiles

  • In the first code block, I disable vector tile caching for specific SDO_GEOMETRY datatype columns that could be used to generate vector tiles via SDO_UTIL.DISABLE_VECTORTILE_CACHE.  Note that if the EXISTING_WIND_TURBINES table had another SDO_GEOMETRY column already being cached for that table, those vector tiles would remain cached.
  • Likewise, as the second code block shows, I can purge any cached vector tiles for a specific table via SDO_UTIL.PURGE_VECTORTILE_CACHE.

Figure 20: Disabling Caching and Purging Already-Cached Vector Tiles

Final Thoughts

I hope these two articles have enlightened you on the new mapping capabilities that Oracle Database 23ai presents, especially if you’re contemplating the deployment of an advanced geospatial solution.

The flexibility and rapid retrieval they offer should make a compelling case for leveraging your existing Oracle Database licensing as your go-to data platform before considering the adoption of other specialized geospatial databases.

Download the code featured in this article
If you’d like to experiment with the APEX 24.2 app and the MapBox examples to explore the mapping features presented in this article, you can download all application code samples here.

Subscribe to the Simple Talk newsletter

Get selected articles, event information, podcasts and other industry content delivered straight to your inbox every two weeks.
Subscribe now

Article tags

Load comments

<span class="icon--download"></span>Downloads

About the author

Jim Czuprynski

See Profile

Jim Czuprynski has 40+ years of professional experience in Information Technology, serving diverse roles at various Fortune 1000 companies before becoming an Oracle DBA in 2001. An Oracle ACE Director since 2014, Jim is a sought-after public speaker on Oracle technology, presenting often at Oracle CloudWorld, ODTUG Kscope, Oracle Development Community tours, and Oracle User Group conferences around the world. Jim has authored 100+ articles on Oracle Database administration, ML/Analytics, and APEX since 2003 and has co-authored four books on Oracle database technology. Jim’s blog, Generally … It Depends (https://jimthewhyguy.com/blog), contains his regular observations on all things Oracle and the state of the IT industry. Find Jim on BlueSky: @jimthewhyguy.bsky.social‬